feat(cli): support optimistic stabilization strategy(--exit-on-config-complete) #29536
The pull request linter has failed. See the aws-cdk-automation comment below for failure reasons. If you believe this pull request should receive an exemption, please comment and provide a justification.
A comment requesting an exemption should contain the text Exemption Request. Additionally, if clarification is needed, add Clarification Request to a comment.
@Mergifyio update |
✅ Branch has been successfully updated |
This integration test is failing:
● deploy with optimistic stabilization
--
Test Suites: 1 failed, 2 passed, 3 total
Could you fix the failure? I can't find the actual error in the logs, it just says "exited with error code 1"
I will get it fixed next week. :-) |
➡️ PR build request submitted to A maintainer must now check the pipeline and add the |
packages/aws-cdk/lib/cli.ts
Outdated
@@ -255,7 +256,8 @@ async function parseCommandLineArguments(args: string[]) { | |||
.command('destroy [STACKS..]', 'Destroy the stack(s) named STACKS', (yargs: Argv) => yargs |
Would it make sense to also add an option for cdk watch?
Correct me if I am wrong, but AFAIK cdk watch essentially monitors a stack and updates it through SDK calls without redeploying it. I think cdk watch would not monitor the stack status, so I can't see how this would fit in this scenario. I might be wrong though.
Ah, I think we're both right. There is an option to turn off the --hot-swap for watch:
https://docs.aws.amazon.com/cdk/v2/guide/cli.html#cli-deploy.
This comment is just a suggestion, though. I would approve this PR with or without watch support, since that could be added in the future if there were demand for it.
}, | ||
], | ||
}); | ||
|
A high level question -- what if a resource fails deploying after CONFIGURATION_COMPLETE status is emitted? Do we handle that case? Is that a possible case? If it is a possible case, could we have an integration test for that?
Will discuss this with the CFN team. This PR monitors CONFIGURATION_COMPLETE in the detailed status of the stack, not of the individual resources. There is a risk that a stack enters the CONFIGURATION_COMPLETE detailed status but its final status is failed. In that case the CDK CLI would assume the stack deployment completed when it actually failed. We need to check:
- Is it possible for this to happen to CFN stacks?
- If it happens, what is the best DX for CDK users?
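To make the risk above concrete, here is a minimal sketch of the kind of decision the CLI would make on each poll. It is not the PR's actual code: `StackSnapshot`, `WaitDecision`, and `decide` are hypothetical names, and the status classification is deliberately simplified. Only the `StackStatus` and `DetailedStatus` field concepts and the `CONFIGURATION_COMPLETE` value come from CloudFormation.

```typescript
// Hypothetical, simplified view of what one status poll reports for a stack.
interface StackSnapshot {
  stackStatus: string;     // e.g. 'CREATE_IN_PROGRESS', 'CREATE_COMPLETE', 'ROLLBACK_COMPLETE'
  detailedStatus?: string; // e.g. 'CONFIGURATION_COMPLETE'
}

type WaitDecision = 'keep-waiting' | 'success' | 'failure';

// Assumed classification: any *_FAILED or rollback status counts as a failure.
function decide(snapshot: StackSnapshot, exitOnConfigComplete: boolean): WaitDecision {
  const { stackStatus, detailedStatus } = snapshot;
  if (/_FAILED$/.test(stackStatus) || /ROLLBACK/.test(stackStatus)) {
    return 'failure';
  }
  if (/_IN_PROGRESS$/.test(stackStatus)) {
    // Optimistic path: report success while the stack is still stabilizing.
    if (exitOnConfigComplete && detailedStatus === 'CONFIGURATION_COMPLETE') {
      return 'success';
    }
    return 'keep-waiting';
  }
  return /_COMPLETE$/.test(stackStatus) ? 'success' : 'keep-waiting';
}

// The same stack, observed at two points in time.
const early: StackSnapshot = { stackStatus: 'CREATE_IN_PROGRESS', detailedStatus: 'CONFIGURATION_COMPLETE' };
const later: StackSnapshot = { stackStatus: 'ROLLBACK_IN_PROGRESS' };

console.log(decide(early, true)); // 'success': the CLI would stop waiting here
console.log(decide(later, true)); // 'failure': but nobody is polling any more
```

With the flag enabled, the first snapshot already ends the wait, so a later failure of the same stack goes unnoticed unless something polls again.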
My thinking is, if that happens, we should fail with exit status 1. One idea, without much thinking about how this would work, is to poll the statuses of all the stacks once the final stack has entered CONFIG_COMPLETE. So then it's just that the final stack of the deployment is not getting the speedup... which I think is fine since you then get the speedup on all the intermediate stacks.
I am wondering whether there is a case where a customer kicks off some sort of production process as soon as they get a success signal from cdk deploy. If so, then we'd want to make sure that everything did successfully provision... unless CloudFormation handles that somehow. I'm not too familiar with this new optimistic stabilization feature from CloudFormation.
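The "check every stack at the end, then exit 1 on any failure" idea could look roughly like the sketch below. It assumes the final `StackStatus` of each stack has already been collected by some polling step that is not shown. `FinalStatuses`, `failedStacks`, and `exitCodeFor` are hypothetical names, and the set of statuses treated as success is an assumption.

```typescript
// Hypothetical input: final StackStatus per stack name, gathered after
// every stack in the deployment has reached a terminal state.
type FinalStatuses = Record<string, string>;

// Assumption: only these terminal statuses count as a successful deployment.
const SUCCESS = /^(CREATE|UPDATE|IMPORT)_COMPLETE$/;

// Names of all stacks whose final status is not a success status.
function failedStacks(finalStatuses: FinalStatuses): string[] {
  return Object.keys(finalStatuses).filter((name) => !SUCCESS.test(finalStatuses[name]));
}

// Exit code for the whole deployment: 0 only if every stack succeeded.
function exitCodeFor(finalStatuses: FinalStatuses): number {
  return failedStacks(finalStatuses).length === 0 ? 0 : 1;
}

// An intermediate stack rolled back after the CLI had already moved on.
const example: FinalStatuses = { Stack1: 'ROLLBACK_COMPLETE', Stack2: 'CREATE_COMPLETE' };
console.log(failedStacks(example)); // [ 'Stack1' ]
console.log(exitCodeFor(example));  // 1
```

Checking every stack rather than only the last one is what catches an intermediate stack that failed after it had reported CONFIGURATION_COMPLETE.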
We should probably figure this out. Say we have 2 stacks to deploy, with Stack2 depending on Stack1:
Stack1 enters CC state -> CLI starts deploying Stack2 -> Stack2 enters CC state -> CLI exits 0, but Stack1 ends up with a final status of failed.
Now we get a boost from Stack1 and Stack2, but Stack1 has actually failed. What should we do? If we just check the final status of the last stack, it would be a success, but Stack1 has actually failed and would probably be rolling back.
My preconception (which could be wrong) is that stack 2 would fail if stack 1 eventually failed. But if the scenario you're describing is true, then that biases me in favor of including a warning about this not being for production, and that would be all we'd do for addressing the failure case. So, similar to how we advise users to not use hotswap in production, we'd advise users to not use this flag in production. What do you think of that? I personally feel satisfied with that.
I'll check with the CFN team:
- Will stack2 fail when stack1 fails with CC status?
a. If stack2 would fail, then we just need to check the status of the last stack.
b. If stack2 would NOT fail, we will end up with a failed stack1 and a deployed stack2.
Yes! I agree with @bergjaak that we should warn customers not to use the flag in production. This is what the official CFN user guide mentions.
Stack 2 will not fail automatically, unless CDK implements a polling mechanism that reads the state and fails the stack based on Stack 1's state.
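If CDK were to implement such a mechanism, one piece of it would be working out which already-deployed stacks sit downstream of a failed one. The following is only an illustrative sketch: `Dependencies` and `stacksToFail` are hypothetical names, and the dependency map is assumed to be available from the CLI's own deployment ordering.

```typescript
// Hypothetical dependency map: stack name -> names of the stacks it depends on.
type Dependencies = Record<string, string[]>;

// Returns every stack that directly or transitively depends on a failed stack
// (the failed stacks themselves are not included in the result).
function stacksToFail(failed: string[], deps: Dependencies): string[] {
  const tainted: Record<string, boolean> = {};
  for (const name of failed) {
    tainted[name] = true;
  }
  // Propagate until a full pass over the map adds nothing new.
  let changed = true;
  while (changed) {
    changed = false;
    for (const stack of Object.keys(deps)) {
      if (tainted[stack]) {
        continue;
      }
      if (deps[stack].some((dep) => tainted[dep] === true)) {
        tainted[stack] = true;
        changed = true;
      }
    }
  }
  return Object.keys(tainted)
    .filter((name) => failed.indexOf(name) === -1)
    .sort();
}

const deps: Dependencies = { Stack1: [], Stack2: ['Stack1'], Stack3: ['Stack2'], Stack4: [] };
console.log(stacksToFail(['Stack1'], deps)); // [ 'Stack2', 'Stack3' ]
```

What the CLI should then do with those downstream stacks (report them, roll them back, or only warn) is exactly the open question in this thread.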
…ub.com/pahud/aws-cdk into pahud/cdk-cli-execute-deployment-29443
The pull request linter fails with the following errors:
PRs must pass status checks before we can provide a meaningful review. If you would like to request an exemption from the status checks or clarification on feedback, please leave a comment on this PR containing ✅ A exemption request has been requested. Please wait for a maintainer's review. |
The CDK should not take this feature. See this comment for further details: #29443 (comment) |
@@ -444,6 +444,36 @@ class LambdaHotswapStack extends cdk.Stack { | |||
} | |||
} | |||
|
|||
class DetailedStatusStack extends cdk.Stack{ |
None of these resources currently cause the stack itself to emit the CONFIGURATION_COMPLETE event, so this test needs to be updated. The only resource I know of that does make it emit this event is ECS Services.
Update: CDK can't support this feature at this moment because CFN would not emit the outputs in We will evaluate this again when CFN has this support. |
This PR adds an --exit-on-config-complete flag for cdk deploy. When this flag is enabled, CDK will monitor the DetailedStatus of the stack being deployed, and return success when the stack enters the CONFIGURATION_COMPLETE state.
This optimistic stabilization strategy can help reduce the total deployment time, especially when working with multiple stacks that have dependencies on each other. Normally, the CDK would wait for the entire deployment process to complete before returning, even if individual stacks had finished. With the --exit-on-config-complete flag, CDK will return as soon as each stack is in the CONFIGURATION_COMPLETE state, allowing subsequent dependent stacks to continue their deployment immediately. This can lead to significant time savings for complex CDK applications with many interdependent stacks.
Issue # (if applicable)
Closes #29443
Reason for this change
Description of changes
When --exit-on-config-complete is provided with cdk deploy, waitForStackDeploy() monitors the DetailedStatus for CONFIGURATION_COMPLETE and returns success upon it.
Description of how you validated changes
Checklist
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license